Context. Several research areas have emerged and are proceeding independently when in fact they have much in common. These include: mutant subsumption and mutant set minimization; relative correctness and the semantic definition of faults; differentiator sets and their application to test diversity; generate-and-validate methods of program repair; test suite coverage metrics.
Objective. Highlight their analogies, commonalities, and overlaps; explore their potential for synergy and shared research goals; unify several disparate concepts around a minimal set of artifacts.
Method. Introduce and analyze a minimal set of concepts that enable us to model these disparate research efforts, and explore how these models may enable us to share insights between different research directions and advance their respective goals.
Results. Capturing absolute (total and partial) correctness and relative (total and partial) correctness with a single concept: detector sets. Using the same concept to quantify the effectiveness of test suites, and proving that the proposed measure satisfies appealing monotonicity properties. Using the measure of test suite effectiveness to model mutant set minimization as an optimization problem, characterized by an objective function and a constraint. Generalizing the concept of mutant subsumption using the concept of differentiator sets. Identifying analogies between detector sets and differentiator sets, and inferring relationships between subsumption and relative correctness.
Conclusion. This paper does not aim to answer a pressing research question so much as it aims to raise research questions that use the insights gained from one research venue to gain a fresh perspective on a related research issue.
Keywords: mutant subsumption; mutant set minimization; relative correctness; absolute correctness; total correctness; partial correctness; program fault; program repair; differentiator set; detector set.
Free, publicly-accessible full text available January 1, 2026.
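The detector-set idea at the center of this abstract can be made concrete with a toy model (my own illustration; `spec`, `P`, `Q`, and the finite domain are assumptions, not taken from the paper): the detector set of a program is the set of inputs on which it disagrees with the specification, so absolute correctness corresponds to an empty detector set and relative correctness to an inclusion between detector sets.

```python
# Toy model of detector sets (illustrative; names and domain are assumptions).

spec = lambda x: x * x                          # the specification
P    = lambda x: x * x if x >= 0 else -x * x    # candidate: wrong on negatives
Q    = lambda x: x * x if x != -1 else -1       # candidate: wrong only at -1

domain = range(-3, 4)                           # finite domain for illustration

def detector_set(prog):
    """Inputs on which prog disagrees with the specification."""
    return {x for x in domain if prog(x) != spec(x)}

print(detector_set(P))                    # {-3, -2, -1}
print(detector_set(Q))                    # {-1}
# Q is strictly more-correct than P: its detector set is strictly smaller.
print(detector_set(Q) < detector_set(P))  # True
```

In this encoding a repair step is any transformation that shrinks the detector set, even if the result is still incorrect.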
-
Ghosh, Sudipto; Troubitsyna, Elena; Chen, Zhenyu (Ed.)
When we quantify the effectiveness of a test suite by its mutation coverage, we are in fact equating test suite effectiveness with fault detection: to the extent that mutations are faithful proxies of actual faults, it is sensible to consider that the ability of a test suite to kill mutants reflects its ability to detect faults. But there is another way to measure the effectiveness of a test suite: by its ability to expose the failures of an incorrect program. The relationship between failures and faults is tenuous at best: a fault is the adjudged or hypothesized cause of a failure, and the same failure may be attributed to more than one fault. This raises the question: what is the relationship between detecting faults and exposing failures? In this paper, we discuss an empirical experiment in which we explore this relationship.
-
Rhode, Matilda; Simmons, Kent (Ed.)
Function Extraction (FX) is a new and evolving paradigm for the production of secure computer code. It is, in effect, the inverse of formal verification, as it analyzes code to produce a mathematical specification of its behavior. This has the potential to identify unwanted or unexpected behaviors and to analyze unknown or "found" code artifacts such as malware. The effort is enabled by recent developments in loop analysis that allow invariant relations to be derived for loop bodies, enabling the loop function to be discovered. The paper defines program behavior as a mathematical description of the effects of program execution on the environment in which the program runs, and continues with a discussion of the paradigm's current status. As FX is an evolving paradigm, areas in which work remains to be done are discussed, and examples of the results of two prototype analyzers are given. The paper concludes with a discussion of the path forward.
-
Invariant relations are used to analyze while loops; while their primary application is to derive the function of a loop, they can also be used to derive loop invariants, weakest preconditions, strongest postconditions, sufficient conditions of correctness, necessary conditions of correctness, and termination conditions of loops. In this paper we present two generic invariant relations that capture the semantics of loops whose loop body applies affine transformations on numeric variables.
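A minimal worked instance of this idea (my own illustration; the paper's two generic relations are more general): for a loop body that applies the affine transformation x := a·x + b to a single numeric variable with a ≠ 1, each iteration multiplies the quantity x + b/(a−1) by a, which yields an invariant relation linking initial and final values of x.

```latex
% Loop body: x := a*x + b, with a \neq 1.
% Each iteration multiplies (x + b/(a-1)) by a, hence the invariant relation:
R \;=\; \left\{ (x,\,x') \;\middle|\; \exists k \ge 0:\;
        x' + \frac{b}{a-1} \;=\; a^{k}\!\left(x + \frac{b}{a-1}\right) \right\}
% For a = 1 the body is a pure increment, and the relation degenerates to:
R_1 \;=\; \{ (x,\,x') \mid \exists k \ge 0:\; x' = x + k\,b \}
```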
-
Several metrics have been proposed in the past to quantify the effectiveness of a test suite; they are usually based on some measure of coverage, because it is sensible to quantify the effectiveness of a test suite by the extent to which it exercises (covers) various syntactic features of the program under test. Though no coverage metric has emerged as the gold standard of test suite effectiveness, mutation coverage is widely perceived as a reliable measure of test suite effectiveness because the ability of a test suite to detect program mutations can be used as an indication of its ability to detect actual faults. In this paper we aim to challenge the superiority of mutation coverage by showing that the same test suite may have vastly different values of mutation coverage depending on the mutation operators that are used in the estimation.
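The sensitivity of mutation coverage to the choice of operators can be shown with a deliberately small sketch (my own construction; the program, mutants, and test input are assumptions, not data from the paper): the same single-test suite kills none of the mutants produced by one operator set and all of the mutants produced by another.

```python
# Illustrative sketch: mutation score of one test suite under two operator sets.

def program(x):
    return x * 2

# Hypothetical mutants generated by two different operator sets.
arithmetic_mutants = [lambda x: x + 2, lambda x: x ** 2]   # operator-replacement
constant_mutants   = [lambda x: x * 3, lambda x: x * 0]    # constant-replacement

test_suite = [2]   # a single test input

def mutation_score(mutants, tests):
    """Fraction of mutants killed: a mutant is killed when some test
    distinguishes its output from the original program's output."""
    killed = sum(any(m(t) != program(t) for t in tests) for m in mutants)
    return killed / len(mutants)

# At x = 2 both arithmetic mutants coincide with program(2) = 4 and survive,
# while both constant mutants differ (6 and 0) and are killed.
print(mutation_score(arithmetic_mutants, test_suite))  # 0.0
print(mutation_score(constant_mutants, test_suite))    # 1.0
```

The same suite thus scores 0% under one operator set and 100% under another, which is the instability the abstract points to.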
-
Since the dawn of programming, several developments in programming language design and programming methodology have been hailed as the end of the profession of programmer; they have all proven to be exaggerated rumors, to echo the words of Mark Twain. In this paper we ponder the question of whether the emergence of large language models finally realizes these prophecies.
-
We propose a set of functions that a user can invoke to analyze a program written in a C-like language: Assume() refers to a label in the source code or to a program part, and enables the user to make an assumption about the state of the program at some label or the function of some program part; Capture() refers to a label or a program part and returns an assertion about the state of the program at the label or the function of the program part; Verify() refers to a label or a program part and tests a unary assertion about the state of the program at the label or a binary assertion about the function of the program part; Establish() refers to a label or a program part and modifies the program code to make Verify() return TRUE at that label or program part, if it did not originally. We discuss the foundations of this tool as well as a preliminary implementation.
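A minimal sketch of how two of these functions might behave (assumed semantics and Python stand-ins, not the paper's implementation): states observed at source labels are recorded, and Verify() tests a unary assertion against the recorded state at a label.

```python
# Stand-ins for Capture()/Verify() on labeled states (assumed semantics).

states = {}  # label -> snapshot of the program state at that label

def Capture(label, state):
    """Record the program state observed at a source label."""
    states[label] = dict(state)
    return states[label]

def Verify(label, assertion):
    """Test a unary assertion about the state recorded at a label."""
    return assertion(states[label])

# Simulate executing a C-like fragment:  x = 3;  L1:  y = x + 1;  L2:
Capture("L1", {"x": 3})
Capture("L2", {"x": 3, "y": 4})

print(Verify("L2", lambda s: s["y"] == s["x"] + 1))  # True
print(Verify("L1", lambda s: s["x"] > 5))            # False
```

Establish() would then be the repair step: modify the code until the corresponding Verify() call returns True.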
-
To repair a program does not mean to make it (absolutely) correct; it only means to make it more-correct than it was originally. This is not a mundane academic distinction: given that programs typically have about a dozen faults per KLOC, it is important for program repair methods and tools to be designed in such a way that they map an incorrect program into a more-correct, albeit still potentially incorrect, program. Yet in the absence of a concept of relative correctness, many program repair methods and tools resort to approximations of absolute correctness; since these methods and tools are often validated against programs with a single fault, making a program absolutely correct is indistinguishable from making it more-correct, which has helped obscure the absence of (and the need for) relative correctness. In this paper we propose a theory of program repair based on a concept of relative correctness. We aspire to encourage researchers in program repair to explicitly specify what concept of relative correctness their method or tool is based upon, and to validate their method or tool by proving that it does enhance relative correctness, as defined.
-
To reduce the cost of mutation testing, researchers have sought to find minimal mutant sets. As an optimization problem, mutant set minimization is defined by two parameters: the objective function that we must optimize, and the constraint under which the optimization is carried out. Whereas the objective function of this optimization problem is clear (minimizing the cardinality of the mutant set), the constraint under which this optimization is attempted has not been clearly articulated in the literature. In this paper, we propose a formal definition of this constraint and discuss in what sense, and to what extent, published algorithms of mutant set minimization comply with this constraint.
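One common encoding of the problem can be sketched as follows (illustrative only: the kill matrix is invented, and the constraint shown is the subsumption-based dominance formulation from the literature, not necessarily the paper's definition). The objective is to minimize the number of retained mutants; the constraint is that any test suite killing the retained mutants also kills every mutant of the original set, which holds when every discarded mutant is strictly subsumed by a retained one.

```python
# Sketch of mutant set minimization via subsumption (kill matrix is invented).

kills = {                 # mutant -> set of tests that kill it
    "m1": {"t1", "t2"},
    "m2": {"t1"},         # killed only by t1
    "m3": {"t2", "t3"},
    "m4": {"t1", "t2"},
}

def strictly_subsumed(m):
    """m is strictly subsumed if some mutant n is killed by a strict, nonempty
    subset of m's killing tests, so any test that kills n also kills m."""
    return any(kills[n] and kills[n] < kills[m] for n in kills)

dominant = sorted(m for m in kills if not strictly_subsumed(m))
print(dominant)  # ['m2', 'm3']
# Any suite killing m2 must contain t1, which also kills m1 and m4; killing m3
# requires t2 or t3. Hence killing {m2, m3} kills all four mutants.
```

The point of the abstract is precisely that this constraint, though implicitly assumed by such algorithms, needs an explicit formal statement.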